Goto

Collaborating Authors

 provable suboptimality


On the Provable Suboptimality of Momentum SGD in Nonstationary Stochastic Optimization

Sahu, Sharan, Hogan, Cameron J., Wells, Martin T.

arXiv.org Machine Learning

In this paper, we provide a comprehensive theoretical analysis of Stochastic Gradient Descent (SGD) and its momentum variants (Polyak Heavy-Ball and Nesterov) for tracking time-varying optima under strong convexity and smoothness. Our finite-time bounds reveal a sharp decomposition of tracking error into transient, noise-induced, and drift-induced components. This decomposition exposes a fundamental trade-off: while momentum is often used as a gradient-smoothing heuristic, under distribution shift it incurs an explicit drift-amplification penalty that diverges as the momentum parameter $β$ approaches 1, yielding systematic tracking lag. We complement these upper bounds with minimax lower bounds under gradient-variation constraints, proving this momentum-induced tracking penalty is not an analytical artifact but an information-theoretic barrier: in drift-dominated regimes, momentum is unavoidably worse because stale-gradient averaging forces systematic lag. Our results provide theoretical grounding for the empirical instability of momentum in dynamic settings and precisely delineate regime boundaries where vanilla SGD provably outperforms its accelerated counterparts.


Search-Based Planning with Provable Suboptimality Bounds for Continuous State Spaces

Gonzalez, Juan Pablo (General Dynamics Robotic Systems) | Likhachev, Maxim (Carnegie Mellon University)

AAAI Conferences

Search-based planning is widely used for mobile robot motion planning because of its guarantees of optimality and completeness. In continuous state-spaces, however, most existing approaches have significant limitations in terms of optimality and completeness because of the underlying grid used. We propose an approach that eliminates the dependency on grids by using more general equivalence classes to quickly find an initial solution and instead of pruning states that fall within an equivalence class and have higher cost, we use an inflated heuristic to lower the priority of these states in the search. In further iterations, we reduce the inflated heuristic in a principled way, thus providing fast solutions with provable suboptimality bounds that can be improved as time allows. The proposed approach produces smooth paths with the resolution dictated by the action set. Finer action sets produce higher resolution paths that are more computationally intensive to calculate and coarser action sets produce lower resolution paths that are faster to compute. To the best of our knowledge, this is the first algorithm that is able to plan in continuous state-spaces with provable guarantees on completeness and bounds on suboptimality for a given action set. Experimental results on 3D (x,y,theta) path planning show that, on average, this approach is able to find paths in less than two seconds that are within 2% of the optimal path cost in worlds of up to 1000x1000 m with a minimum step size of one meter.